A Windowing based GPU optimized strategy for the induction of Decision Trees in JaCa-DDM
Abstract
When inducing Decision Trees, Windowing consists of selecting a random subset of the available training instances (the window) to induce a tree, and then improving that tree by adding counter examples, i.e., instances not covered by the tree, to the window and inducing a new tree. The process iterates until all instances are correctly classified or no further accuracy is gained. In favorable domains, the technique is known to speed up induction and to improve the accuracy of the induced tree while reducing the number of training instances used. This paper introduces a Windowing-based strategy that uses GPUs to optimize the search for counter examples, in order to cope with Distributed Data Mining (DDM) scenarios. The strategy is defined and implemented in JaCa-DDM, a novel system founded on the Agents & Artifacts paradigm. The approach is well suited to DDM problems that generate large amounts of training instances. Experiments in diverse domains compare the strategy with the traditional centralized approach, including an exploratory case study on pixel-based segmentation for the detection of precancerous cervical lesions in colposcopic images.
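The Windowing loop described in the abstract can be illustrated with a minimal, sequential Python sketch. This is not the JaCa-DDM implementation: `induce_tree` is a toy learner for categorical attributes standing in for a real inducer such as J48, it splits on misclassification count rather than an entropy-based measure, and all function and parameter names (`windowing`, `initial_size`, `max_rounds`) are hypothetical. The stopping test covers only the "no counter examples left" case, with a round limit as a guard. The scan that collects counter examples is the step the paper proposes to accelerate on GPUs; here it is an ordinary CPU loop.

```python
import random
from collections import Counter

def induce_tree(instances):
    """Toy tree learner for categorical attributes (stand-in for J48)."""
    labels = [y for _, y in instances]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1:
        return majority  # pure node: a leaf is just a class label
    best = None
    for a in range(len(instances[0][0])):
        groups = {}
        for x, y in instances:
            groups.setdefault(x[a], []).append((x, y))
        if len(groups) < 2:
            continue  # attribute does not separate these instances
        # Errors made if every branch predicted its own majority class.
        errors = sum(len(g) - Counter(y for _, y in g).most_common(1)[0][1]
                     for g in groups.values())
        if best is None or errors < best[0]:
            best = (errors, a, groups)
    if best is None:
        return majority
    _, a, groups = best
    return (a, {v: induce_tree(g) for v, g in groups.items()}, majority)

def classify(tree, x):
    """Follow branches until a leaf; unseen values fall back to the majority."""
    while isinstance(tree, tuple):
        a, branches, majority = tree
        tree = branches.get(x[a], majority)
    return tree

def windowing(data, initial_size, max_rounds=20, seed=0):
    """Grow a window of training instances until no counter examples remain."""
    rng = random.Random(seed)
    indices = list(range(len(data)))
    rng.shuffle(indices)
    window = set(indices[:initial_size])  # random initial window
    for _ in range(max_rounds):
        tree = induce_tree([data[i] for i in sorted(window)])
        # Counter examples: instances outside the window that the tree gets wrong.
        counter = [i for i in indices
                   if i not in window and classify(tree, data[i][0]) != data[i][1]]
        if not counter:
            break
        window.update(counter)
    return tree, len(window)
```

Each instance is an `(attributes, label)` pair with string labels, and `windowing` returns the final tree together with the number of instances the window ended up holding, which can be compared with `len(data)` to see how much of the training set was actually used.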
Similar resources
An Agents and Artifacts Approach to Distributed Data Mining
This paper proposes a novel Distributed Data Mining (DDM) approach based on the Agents and Artifacts paradigm, as implemented in CArtAgO [9], where artifacts encapsulate data mining tools, inherited from Weka, that agents can use while engaged in collaborative, distributed learning processes. Target hypotheses are currently constrained to decision trees built with J48, but the approach is flexi...
Evolutionary induction of a decision tree for large-scale data: a GPU-based approach
Evolutionary induction of decision trees is an emerging alternative to greedy top-down approaches. Its growing popularity results from good prediction performance and less complex output trees. However, one of the major drawbacks associated with the application of evolutionary algorithms is the tree induction time, especially for large-scale data. In the paper, we design and implement a graphic...
Study on Driving Decision-Making Mechanism of Autonomous Vehicle Based on an Optimized Support Vector Machine Regression
Driving Decision-making Mechanism (DDM) is identified as the key technology for ensuring the driving safety of autonomous vehicles, and it is mainly influenced by vehicle states and road conditions. However, previous studies have seldom considered road conditions and their coupled effects on driving decisions. Therefore, road conditions are introduced into DDM in this paper, and are based on a Suppo...
DIAGNOSIS OF BREAST LESIONS USING THE LOCAL CHAN-VESE MODEL, HIERARCHICAL FUZZY PARTITIONING AND FUZZY DECISION TREE INDUCTION
Breast cancer is one of the leading causes of death among women. Mammography remains today the best technology to detect breast cancer, early and efficiently, to distinguish between benign and malignant diseases. Several techniques in image processing and analysis have been developed to address this problem. In this paper, we propose a new solution to the problem of computer aided detection and...
Decentralized Routing and Power Allocation in FDMA Wireless Networks based on H∞ Fuzzy Control Strategy
Simultaneous routing and resource allocation has been considered in wireless networks for its performance improvement. In this paper we propose a cross-layer optimization framework for worst-case queue length minimization in some types of FDMA-based wireless networks, in which the data routing and the power allocation problems are jointly optimized with a fuzzy distributed H∞ control strategy ....